Constructive Preference Elicitation over Hybrid Combinatorial Spaces
Preference elicitation is the task of suggesting a highly preferred
configuration to a decision maker. The preferences are typically learned by
querying the user for choice feedback over pairs or sets of objects. In its
constructive variant, new objects are synthesized "from scratch" by maximizing
an estimate of the user utility over a combinatorial (possibly infinite) space
of candidates. In the constructive setting, most existing elicitation
techniques fail because they rely on exhaustive enumeration of the candidates.
A previous solution explicitly designed for constructive tasks comes with no
formal performance guarantees, and can be very expensive in (or inapplicable
to) problems with non-Boolean attributes. We propose the Choice Perceptron, a
Perceptron-like algorithm for learning user preferences from set-wise choice
feedback over constructive domains and hybrid Boolean-numeric feature spaces.
We provide a theoretical analysis on the attained regret that holds for a large
class of query selection strategies, and devise a heuristic strategy that aims
at optimizing the regret in practice. Finally, we demonstrate its effectiveness
by empirical evaluation against existing competitors on constructive scenarios
of increasing complexity.
Comment: AAAI 2018. Keywords: computing methodologies, machine learning, learning
paradigms, supervised learning, structured output.
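The set-wise update at the heart of a Perceptron-like elicitation algorithm can be sketched as follows. This is a hedged illustration of the general idea (move the weights toward the chosen configuration's features and away from the rejected ones), not the exact Choice Perceptron update; the feature vectors, averaging scheme, and step size `eta` are all illustrative assumptions.

```python
import numpy as np

def choice_update(w, chosen, others, eta=1.0):
    """Perceptron-style step from set-wise choice feedback: shift the
    utility weights toward the features of the configuration the user
    chose and away from the mean features of the rejected ones.
    (Illustrative sketch only; details differ from the actual paper.)"""
    phi_chosen = np.asarray(chosen, dtype=float)
    phi_rest = np.mean(np.asarray(others, dtype=float), axis=0)
    return w + eta * (phi_chosen - phi_rest)

# Toy example over 3-dimensional hybrid (Boolean-numeric) feature vectors.
w = np.zeros(3)
w = choice_update(w, chosen=[1.0, 0.0, 2.0],
                  others=[[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]])
```

With this update, configurations resembling past choices receive higher estimated utility at the next query round.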
Decomposition Strategies for Constructive Preference Elicitation
We tackle the problem of constructive preference elicitation, that is the
problem of learning user preferences over very large decision problems,
involving a combinatorial space of possible outcomes. In this setting, the
suggested configuration is synthesized on-the-fly by solving a constrained
optimization problem, while the preferences are learned iteratively by
interacting with the user. Previous work has shown that Coactive Learning is a
suitable method for learning user preferences in constructive scenarios. In
Coactive Learning the user provides feedback to the algorithm in the form of an
improvement to a suggested configuration. When the problem involves many
decision variables and constraints, this type of interaction poses a
significant cognitive burden on the user. We propose a decomposition technique
for large preference-based decision problems relying exclusively on inference
and feedback over partial configurations. This has the clear advantage of
drastically reducing the user cognitive load. Additionally, part-wise inference
can be (up to exponentially) less computationally demanding than inference over
full configurations. We discuss the theoretical implications of working with
parts and present promising empirical results on one synthetic and two
realistic constructive problems.
Comment: Accepted at the Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18).
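The part-wise interaction described above can be sketched as a Coactive Learning update restricted to one part of the configuration: the user improves only a partial configuration, so only the corresponding weights change. The index-based decomposition, feature maps, and step size are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def partwise_coactive_update(w, phi_suggested, phi_improved, part, eta=1.0):
    """Coactive-Learning-style step on a partial configuration: the
    update moves the weights toward the user's improvement, but only
    on the feature indices in `part`, leaving the rest untouched.
    (Hedged sketch; the thesis decomposition is more general.)"""
    w = np.array(w, dtype=float)
    diff = np.asarray(phi_improved, float) - np.asarray(phi_suggested, float)
    w[part] += eta * diff[part]
    return w

# Feedback concerns only the first two decision variables (indices 0, 1).
w = partwise_coactive_update(np.zeros(4),
                             phi_suggested=[1.0, 0.0, 1.0, 0.0],
                             phi_improved=[0.0, 1.0, 1.0, 1.0],
                             part=[0, 1])
```

Because each query and update touches only one part, both the user's cognitive load and the cost of inference drop relative to full-configuration feedback.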
Statistical Relational Learning for Proteomics: Function, Interactions and Evolution
In recent years, the field of Statistical Relational Learning (SRL) [1, 2] has
produced new, powerful learning methods that are explicitly designed to solve
complex problems, such as collective classification, multi-task learning and
structured output prediction, and that natively handle relational data, noise,
and partial information. Statistical-relational methods rely on some First-Order
Logic as a general, expressive formal language to encode both the data
instances and the relations or constraints between them. The latter encode
background knowledge on the problem domain, and are used to restrict or bias
the model search space according to the instructions of domain experts. The
new tools developed within SRL make it possible to revisit old computational
biology problems in a less ad hoc fashion, and to tackle novel, more complex
ones. Motivated by these developments, in this thesis we describe and discuss
the application of SRL to three important biological problems, highlighting the
advantages, discussing the trade-offs, and pointing out the open problems.
In particular, in Chapter 3 we show how to jointly improve the outputs
of multiple correlated predictors of protein features by means of a very
general probabilistic-logical consistency layer. The logical layer — based on
grounding-specific Markov Logic networks [3] — enforces a set of weighted
first-order rules encoding biologically motivated constraints between the
predictions. The refiner then improves the raw predictions so that they least
violate the constraints. Contrary to canonical methods for the prediction
of protein features, which typically take predicted correlated features as
inputs to improve the output post facto, our method can jointly refine all
predictions together, with potential gains in overall consistency. In order
to showcase our method, we integrate three stand-alone predictors of
correlated features, namely subcellular localization (Loctree [4]), disulfide
bonding state (Disulfind [5]), and metal bonding state (MetalDetector [6]), in
a way that takes into account the respective strengths and weaknesses. The
experimental results show that the refiner can improve the performance of the
underlying predictors by removing rule violations. In addition, the proposed
method is fully general, and could in principle be applied to an array of
heterogeneous predictions without requiring any change to the underlying
software.
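The consistency-layer idea can be illustrated with a toy version: search for the joint Boolean assignment over predicted features that best agrees with the raw predictor scores while paying a weighted penalty for each violated rule. Exhaustive enumeration here stands in for Markov Logic MAP inference, and the feature names, scores, and rule are illustrative assumptions, not the thesis model.

```python
import itertools

def refine(raw_scores, rules):
    """Toy stand-in for the probabilistic-logical consistency layer:
    enumerate joint Boolean assignments of the predicted features,
    reward agreement with the raw scores, subtract the weight of every
    violated rule, and return the best assignment.
    (Sketch only; real inference does not enumerate.)"""
    feats = sorted(raw_scores)
    best, best_val = None, float("-inf")
    for bits in itertools.product([0, 1], repeat=len(feats)):
        y = dict(zip(feats, bits))
        val = sum(raw_scores[f] * (1 if y[f] else -1) for f in feats)
        val -= sum(w for w, rule in rules if not rule(y))
        if val > best_val:
            best, best_val = y, val
    return best

# Illustrative constraint: a residue cannot be both disulfide- and
# metal-bonded; weight 5.0 makes violating it very costly.
rules = [(5.0, lambda y: not (y["disulfide"] and y["metal"]))]
raw = {"disulfide": 0.9, "metal": 0.4}
refined = refine(raw, rules)
```

Note how the weaker raw prediction ("metal") is flipped off to satisfy the mutual-exclusion rule, while the stronger one survives, which is the joint-refinement behavior the chapter describes.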
In Chapter 4 we consider the multi-level protein–protein interaction (PPI)
prediction problem. In general, PPIs can be seen as a hierarchical process
occurring at three related levels: proteins bind by means of specific domains,
which in turn form interfaces through patches of residues. Detailed knowledge
about which domains and residues are involved in a given interaction has
extensive applications to biology, including better understanding of the
binding process and more efficient drug/enzyme design. We cast the prediction
problem in terms of multi-task learning, with one task per level (proteins,
domains and residues), and propose a machine learning method that collectively
infers the binding state of all object pairs, at all levels, concurrently.
Our method is based on Semantic Based Regularization (SBR) [7], a flexible
and theoretically sound SRL framework that employs First-Order Logic
constraints to tie the learning tasks together. Contrary to most current PPI
prediction methods, which neither identify which regions of a protein
actually instantiate an interaction nor leverage the hierarchy of predictions,
our method resolves the prediction problem down to the residue level, enforces
consistent predictions between the hierarchy levels, and fruitfully exploits
the hierarchical nature of the problem. We present numerical results showing
that our method substantially outperforms the baseline in several
experimental settings, indicating that our multi-level formulation can indeed
lead to better predictions.
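The hierarchy constraint tying the three levels together can be made concrete with a minimal sketch: a residue pair can interact only if its domain pair does, and a domain pair only if the protein pair does. SBR enforces such constraints softly during learning; applying them as a hard post-hoc filter, as below, is an illustrative simplification, and all identifiers are hypothetical.

```python
def enforce_hierarchy(protein_pair, domain_pairs, residue_pairs, dom_of):
    """Hard version of the multi-level PPI consistency constraint:
    a domain-pair prediction survives only if the protein pair binds,
    and a residue-pair prediction only if its domain pair survives.
    `dom_of` maps each residue pair to its domain pair.
    (Hedged sketch; SBR imposes this softly via logic constraints.)"""
    domain_pairs = {d: v and protein_pair for d, v in domain_pairs.items()}
    residue_pairs = {r: v and domain_pairs[dom_of[r]]
                     for r, v in residue_pairs.items()}
    return protein_pair, domain_pairs, residue_pairs

prot, doms, res = enforce_hierarchy(
    protein_pair=True,
    domain_pairs={"d1": True, "d2": False},
    residue_pairs={"r1": True, "r2": True},
    dom_of={"r1": "d1", "r2": "d2"})
```

Here the residue-level prediction `r2` is suppressed because its domain pair `d2` is predicted not to interact, which is exactly the cross-level consistency the chapter enforces.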
Finally, in Chapter 5 we consider the problem of predicting drug-resistant
protein mutations through a combination of Inductive Logic Programming [8,
9] and Statistical Relational Learning. In particular, we focus on viral
proteins: viruses are typically characterized by high mutation rates, which
allow them to quickly develop drug-resistant mutations. Mining relevant rules
from mutation data can be extremely useful to understand the virus adaptation
mechanism and to design drugs that effectively counter potentially resistant
mutants. We propose a simple approach for mutant prediction where the input
consists of mutation data with drug-resistance information, either as sets
of mutations conferring resistance to a certain drug, or as sets of mutants
with information on their susceptibility to the drug. The algorithm learns a
set of relational rules characterizing drug-resistance, and uses them to
generate a set of potentially resistant mutants. Learning a weighted
combination of rules allows us to attach a resistance score, as predicted by
the statistical relational model, to each generated mutant and to select only
the highest scoring ones. Promising results were obtained in generating
resistant mutations for both nucleoside and non-nucleoside HIV reverse
transcriptase inhibitors. The approach can be generalized quite easily to
learning mutants characterized by more complex rules correlating multiple
mutations.
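The final scoring step described above can be sketched as summing the weights of the learned rules that fire on a candidate mutant. The rules and weights below are toy assumptions, not learned from data; the mutation names (K103N, Y181C) are well-known HIV reverse transcriptase resistance mutations used purely for illustration.

```python
def score_mutant(mutant, weighted_rules):
    """Resistance score of a candidate mutant (a set of mutations):
    sum the weights of every rule whose required mutations are all
    present. (Hedged sketch of the weighted-rule-combination idea;
    the thesis learns the rules and weights from resistance data.)"""
    return sum(w for w, required in weighted_rules if required <= mutant)

# Toy weighted rules: each fires when its mutation set is present.
rules = [(2.0, {"K103N"}), (1.5, {"Y181C", "K103N"})]

# Rank hypothetical generated mutants and keep the highest scoring one.
top = max([{"K103N"}, {"Y181C"}, {"K103N", "Y181C"}],
          key=lambda m: score_mutant(m, rules))
```

Mutants matching several high-weight rules accumulate larger scores, which is what lets the method select only the most plausibly resistant candidates.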